AITopics | learnable prompt

Bisecle: Binding and Separation in Continual Learning for Video Language Understanding

Neural Information Processing Systems | Jun-16-2026, 02:06:06 GMT

However, real-world videos typically exist as continuously evolving data streams (e.g., dynamic scenes captured by wearable glasses), necessitating models to continually adapt to shifting data distributions and novel scenarios. Considering the prohibitive computational costs of fine-tuning models on new tasks, usually, a small subset of parameters is updated while the bulk of the model remains frozen. This poses new challenges to existing continual learning frameworks in the context of large multimodal foundation models, i.e., catastrophic forgetting and update conflict. While the foundation models struggle with parameter-efficient continual learning, the hippocampus in the human brain has evolved highly efficient mechanisms for memory formation and consolidation. Inspired by the rapid Binding and pattern separation mechanisms in the hippocampus, in this work, we propose Bisecle for video-language continual learning, where a multi-directional supervision module is used to capture more cross-modal relationships and a contrastive prompt learning scheme is designed to isolate task-specific knowledge to facilitate efficient memory storage. Binding and separation processes further strengthen the ability of VLMs to retain complex experiences, enabling robust and efficient continual learning in video understanding tasks. We perform a thorough evaluation of the proposed Bisecle, demonstrating its ability to mitigate forgetting and enhance cross-task generalization on several VideoQA benchmarks.

large language model, machine learning, natural language, (20 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection

Neural Information Processing Systems | Mar-20-2026, 22:55:17 GMT

Detecting Human-Object Interactions (HOI) in zero-shot settings, where models must handle unseen classes, poses significant challenges. Existing methods that align visual encoders with large Vision-Language Models (VLMs) to tap into the VLMs' extensive knowledge require large, computationally expensive models and encounter training difficulties. Adapting VLMs with prompt learning offers an alternative to direct alignment. However, fine-tuning on task-specific datasets often leads to overfitting to seen classes and suboptimal performance on unseen classes, due to the absence of unseen class labels. To address these challenges, we introduce a novel prompt learning-based framework for Efficient Zero-Shot HOI detection (EZ-HOI). First, we introduce Large Language Model (LLM) and VLM guidance for learnable prompts, integrating detailed HOI descriptions and visual semantics to adapt VLMs to HOI tasks. However, because training datasets contain seen-class labels alone, fine-tuning VLMs on such datasets tends to optimize learnable prompts for seen classes instead of unseen ones. Therefore, we design prompt learning for unseen classes using information from related seen classes, with LLMs utilized to highlight the differences between unseen and related seen classes. Quantitative evaluations on benchmark datasets demonstrate that our EZ-HOI achieves state-of-the-art performance across various zero-shot settings with only 10.35% to 33.95% of the trainable parameters compared to existing methods.

artificial intelligence, large language model, natural language, (7 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

654f61ecd998c9095d30d42c03b832aa-Paper-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 11:55:39 GMT

detection, hoi class, proceedings, (13 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Singapore (0.04)
North America > United States > Mississippi (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

d6938c8e88ef62394d2f4f3fd428e036-Paper-Conference.pdf

Neural Information Processing Systems | Nov-20-2025, 04:52:05 GMT

large language model, machine learning, natural language, (18 more...)

Country:

Asia > China (0.04)
Oceania > Australia (0.04)
North America > United States (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law (0.46)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Doubly Debiased Test-Time Prompt Tuning for Vision-Language Models

Song, Fei, Li, Yi, Wang, Rui, Zhou, Jiahuan, Zheng, Changwen, Li, Jiangmeng

arXiv.org Artificial Intelligence | Nov-18-2025

Test-time prompt tuning for vision-language models has demonstrated impressive generalization capabilities under zero-shot settings. However, tuning the learnable prompts solely based on unlabeled test data may induce prompt optimization bias, ultimately leading to suboptimal performance on downstream tasks. In this work, we analyze the underlying causes of prompt optimization bias from both the model and data perspectives. In terms of the model, the entropy minimization objective typically focuses on reducing the entropy of model predictions while overlooking their correctness. This can result in overconfident yet incorrect outputs, thereby compromising the quality of prompt optimization. On the data side, prompts affected by optimization bias can introduce misalignment between visual and textual modalities, which further aggravates the prompt optimization bias. To this end, we propose a Doubly Debiased Test-Time Prompt Tuning method. Specifically, we first introduce a dynamic retrieval-augmented modulation module that retrieves high-confidence knowledge from a dynamic knowledge base using the test image feature as a query, and uses the retrieved knowledge to modulate the predictions. Guided by the refined predictions, we further develop a reliability-aware prompt optimization module that incorporates a confidence-based weighted ensemble and cross-modal consistency distillation to impose regularization constraints during prompt tuning. Extensive experiments across 15 benchmark datasets involving both natural distribution shifts and cross-dataset generalization demonstrate that our method outperforms baselines, validating its effectiveness in mitigating prompt optimization bias.
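The model-side concern in this abstract, that an entropy objective rewards confidence rather than correctness, can be illustrated with a minimal sketch. The logits, the assumed true class, and the helper names (`softmax`, `entropy`, `argmax`) are hypothetical values chosen for illustration; they are not taken from the paper and do not reproduce its method.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical 3-class example; we assume the true class is index 0.
true_class = 0
confident_wrong = softmax([0.0, 6.0, 0.0])  # sharply peaked on class 1 (incorrect)
uncertain_right = softmax([1.0, 0.5, 0.5])  # mildly favors class 0 (correct)

h_wrong = entropy(confident_wrong)
h_right = entropy(uncertain_right)

# An objective that only minimizes entropy would prefer the first prediction,
# even though it is the incorrect one.
print(f"confident but wrong: entropy={h_wrong:.3f}, predicted={argmax(confident_wrong)}")
print(f"uncertain but right: entropy={h_right:.3f}, predicted={argmax(uncertain_right)}")
```

In this toy case the incorrect prediction has the lower entropy, so entropy alone cannot distinguish a confident mistake from a correct answer.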

large language model, machine learning, natural language, (18 more...)

2511.1169

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis

Hongyu Sun

Neural Information Processing Systems | Oct-10-2025, 18:02:38 GMT

Recent works demonstrate that the performance of 3D point cloud recognition can be boosted remarkably by parameter-efficient prompt tuning. However, we observe that the improvement on downstream tasks comes at the expense of a severe drop in 3D domain generalization.

computer vision, generalization, proceedings, (13 more...)

Country:

Asia > China (0.04)
Oceania > Australia (0.04)
North America > United States (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law (0.46)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

654f61ecd998c9095d30d42c03b832aa-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 04:39:16 GMT

detection, hoi class, proceedings, (13 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Singapore (0.04)
North America > United States > Mississippi (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Bisecle: Binding and Separation in Continual Learning for Video Language Understanding

Tan, Yue, Hu, Xiaoqian, Xue, Hao, De Melo, Celso, Salim, Flora D.

arXiv.org Artificial Intelligence | Jul-2-2025

Frontier vision-language models (VLMs) have made remarkable improvements in video understanding tasks. However, real-world videos typically exist as continuously evolving data streams (e.g., dynamic scenes captured by wearable glasses), necessitating models to continually adapt to shifting data distributions and novel scenarios. Considering the prohibitive computational costs of fine-tuning models on new tasks, usually, a small subset of parameters is updated while the bulk of the model remains frozen. This poses new challenges to existing continual learning frameworks in the context of large multimodal foundation models, i.e., catastrophic forgetting and update conflict. While the foundation models struggle with parameter-efficient continual learning, the hippocampus in the human brain has evolved highly efficient mechanisms for memory formation and consolidation. Inspired by the rapid Binding and pattern separation mechanisms in the hippocampus, in this work, we propose Bisecle for video-language continual learning, where a multi-directional supervision module is used to capture more cross-modal relationships and a contrastive prompt learning scheme is designed to isolate task-specific knowledge to facilitate efficient memory storage. Binding and separation processes further strengthen the ability of VLMs to retain complex experiences, enabling robust and efficient continual learning in video understanding tasks. We perform a thorough evaluation of the proposed Bisecle, demonstrating its ability to mitigate forgetting and enhance cross-task generalization on several VideoQA benchmarks.

large language model, machine learning, natural language, (18 more...)

2507.00469

Country: Oceania > Australia > New South Wales (0.14)

Genre: Research Report > Promising Solution (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing

Lotfi, Fatemeh, Rajoli, Hossein, Afghah, Fatemeh

arXiv.org Artificial Intelligence | Jun-3-2025

Modern wireless networks must adapt to dynamic conditions while efficiently managing diverse service demands. Traditional deep reinforcement learning (DRL) struggles in these environments, as scattered and evolving feedback makes optimal decision-making challenging. Large Language Models (LLMs) offer a solution by structuring unorganized network feedback into meaningful latent representations, helping RL agents recognize patterns more effectively. For example, in O-RAN slicing, concepts like SNR, power levels, and throughput are semantically related, and LLMs can naturally cluster them, providing a more interpretable state representation. To leverage this capability, we introduce a contextualization-based adaptation method that integrates learnable prompts into an LLM-augmented DRL framework. Instead of relying on full model fine-tuning, we refine state representations through task-specific prompts that dynamically adjust to network conditions. Utilizing ORANSight, an LLM trained on O-RAN knowledge, we develop the Prompt-Augmented Multi-agent RL (PA-MRL) framework. Learnable prompts optimize both semantic clustering and RL objectives, allowing RL agents to achieve higher rewards in fewer iterations and adapt more efficiently. By incorporating prompt-augmented learning, our approach enables faster, more scalable, and adaptive resource allocation in O-RAN slicing. Experimental results show that it accelerates convergence and outperforms other baselines.

large language model, machine learning, reinforcement learning, (18 more...)

2506.00574

Genre: Research Report > New Finding (0.34)

Technology: